ABSTRACT

Consider the string matching problem, where differences between characters of the
pattern and characters of the text are allowed.  Each difference is due to either a
mismatch between a character of the text and a character of the pattern, or a
superfluous character in the text, or a superfluous character in the pattern.  Given a
text of length n, a pattern of length m and an integer k, we present parallel and
serial algorithms for finding all occurrences of the pattern in the text with at most k
differences.  The first part of the parallel algorithm consists of analysis of the
pattern and takes O(log m) time using m^2 processors.  The rest of the algorithm
consists of handling the text.  The text handling part applies the following new
approach.  This part starts by obtaining a concise characterization of the text which
is based solely on substrings of the pattern in O(log m) time using n/log m
processors.  Then the desired output is derived from this characterization together
with the tables built in the first part in O(k) time using n processors.
The serial algorithm also follows this new approach for handling the text.  It runs in
O(kn) time for an alphabet whose size is fixed.  For general input the algorithm
requires O(n(k + log m)) time.  In both cases the space requirement is O(n).

1.   Introduction

The problem.  Input.  Two arrays: A = a_1,...,a_m - the pattern, T = t_1,...,t_n - the text,
and an integer k (k ≥ 1).

In  the  known  problem  of  pattern  matching  in  strings  (e.g.,  as  discussed  in  [KMP-77])  we 
are  interested  in  finding  all  occurrences  of  the  pattern  in  the  text.  In  the  present  paper  we 
are  interested  in  designing  an  algorithm  that  finds  all  such  occurrences  with  at  most  k 
differences. 

Example.  Let the text be abcdefghi, the pattern bxdyegh and k = 3.  Let us see whether
there is an occurrence with ≤ k differences that ends at the eighth location of the text.  For
this we propose the following correspondence between bcdefgh and bxdyegh.  1. b (of the
text) corresponds to b (of the pattern).  2. c to x.  3. d to d.  4. Nothing to y.  5. e to e.
6. f to nothing.  7. g to g.  8. h to h.  The correspondence can be illustrated as

b x d y e   g h
b c d   e f g h

In  only  three  places  the  correspondence  is  between  non-equal  characters,  implying  that 
there  is  an  occurrence  of  the  pattern  that  ends  at  the  eighth  location  of  the  text  with  3 
differences  as  required. 


We  distinguish  three  types  of  differences: 

(a)  A  character  of  the  pattern  corresponds  to  a  different  character  of  the  text.  (Item  2  in 
the  Example).  In  this  case  we  say  that  there  is  a  mismatch  between  the  two  characters. 

(b)  A  character  of  the  pattern  corresponds  to  "no  character"  in  the  text.  (Item  4). 

(c)  A  character  of  the  text  corresponds  to  "no  character"  in  the  pattern.  (Item  6). 

We  consider  the  following  problem. 
The string matching with k-differences problem.  (In short, the k-differences problem.)
Find all occurrences of the pattern in the text with at most k differences of type (a), (b) and
(c).

The case k = 0 in the k-differences problem is the extensively studied string matching
problem.  There are a few notable algorithms for the string matching problem: linear time
serial algorithms - [BM-77], [GS-83], [KMP-77], [KR-80] (a randomized algorithm) and
[V-85b], parallel algorithms [G-84] and [V-85b].

Even these parallel algorithms for exact string matching had to abandon their
preceding linear time serial algorithms, since these serial algorithms do not seem amenable
to parallelism.  We note that none of these serial and parallel algorithms is suitable to cope
with the k-differences problem.  Moreover, the remark below explains why even the way in
which parallelism is approached in these parallel algorithms is unlikely to generalize
to approximate string matching.

Remark.  [G-84] and [V-85b] gave parallel algorithms for exact string matching.  We give a
short description of their approach and explain why we had difficulties in applying it to
the k-differences problem.  The main part in the text handling parts of each of these
algorithms consists of eliminating many entries of the text at which occurrences of the
pattern cannot start.  This elimination process iterates the following step: it picks a proper
pair of "close" entries which have not yet been eliminated.  The pair is proper in the sense
that, based on information gathered in the pattern analysis, an occurrence may start in at
most one of these entries.  Then, one of these entries is eliminated.  This results in a small
enough number of remaining entries in which occurrences may start.  (The final part in
each of these algorithms is a straightforward procedure which checks whether there is an
occurrence in each of these remaining entries.)  We see no way of applying a similar
elimination process to approximate string matching problems.  The reason is that the
differences which are allowed between the pattern and the text enable coexistence of
seemingly contradicting occurrences.  Indeed, our solution is constructive in the sense that
it finds all occurrences without a preceding stage in which some entries in which an
occurrence is impossible are eliminated.

The model of computation used in this paper is the random-access-machine (RAM)
[AHU-74] for the serial algorithm, and the concurrent-read exclusive-write (CREW)
parallel random access machine (PRAM) for the parallel algorithm.  A PRAM employs p
synchronous processors all having access to a common memory.  A CREW PRAM allows
simultaneous access by more than one processor to the same memory location for read but
not for write purposes.  See [V-83] for a survey of results concerning PRAMs.

The k-differences problem is not only a basic theoretical problem.  It also has a strong
pragmatic flavor.  In practice, we often need to analyze situations where the data is not
completely reliable.  Specifically, consider a situation where the strings which are the input
for our problem contain errors and we still need to find all possible occurrences of the
pattern in the text as in reality.  The errors may include a character being replaced by
another character, a character being omitted or a superfluous character being inserted.
Assuming some bound on the number of errors would clearly imply our problem.  We refer
the reader to [SK-83], a book which is essentially devoted to various instances of the
k-differences problem.  The book gives a comprehensive review of applications of the
problem in a variety of fields, including computer science, molecular biology and speech
recognition.

We give a first parallel algorithm for the k-differences problem.  The algorithm has
three main parts: I. Analysis of the pattern.  II. Analysis of the text.  III. Finding all
occurrences of the pattern in the text with at most k differences.

Part I processes the pattern only and results in a few tables.  These tables contain
information on how substrings of the pattern relate to other substrings of the pattern.  In
principle, similar constructs were used in early string matching algorithms like [KMP-77].
Part I needs O(log m) time using m^2 processors.

Part II processes the text using the tables that were built in Part I.  It results in a table
which characterizes the whole text using only substrings of the pattern as yardsticks.  Such
constructs seem to be new.

Part III uses only the tables built in Parts I and II in order to derive the desired output.  In
other words, the characterization of the text which was obtained in Part II turns out to be
so powerful that we do not need to take another look at the text.

The complexity results for the text handling parts (Parts II and III) demonstrate that the
characterization obtained in Part II is concise: (a) The characterization can be computed
efficiently - Part II needs O(log m) time using n/log m processors to compute it.  (b) The
characterization provides for an efficient solution of the k-differences problem - Part III
needs O(k) time using n processors for finishing the solution of the k-differences problem.

The present paper demonstrates how parallel algorithms can enrich the field of serial
algorithms.  We first discovered the parallel algorithm.  We then noticed that the parallel
algorithm yields as a byproduct a new serial algorithm for the k-differences problem which
is considerably simpler.  The serial algorithm runs in O(kn) time for an alphabet whose size is
fixed and requires O(n(k + log m)) time for general input.  In both cases the space
requirement is O(n).  In [LV-85a], [LV-85c] the authors give two implementations of a
serial algorithm for the k-differences problem.  The first one [LV-85a] runs in O(m^2 + nk^2)
time for general input and requires O(m^2) space.  The second one [LV-85c] runs in O(m + k^2 n)
time for an alphabet whose size is fixed.  For general input the algorithm requires
O(m log m + k^2 n) time.  In both cases the space requirement is O(m).  Our new serial
algorithm is faster than these previous algorithms when the size of the alphabet is fixed, or
for general input when k^2 is larger than log m by an order of magnitude.

Using the notation of the first paragraph of this section we define the k-mismatches
problem as follows.  The input is the same as for the k-differences problem.  The problem is
to find all occurrences of the pattern in the text with at most k differences of type (a).
[LV-85a], [LV-85b] give an algorithm for the k-mismatches problem.  It runs in
O(k(m log m + n)) time for general input.  Our new serial algorithm for the k-differences
problem can also handle the k-mismatches problem.  So, when the alphabet size is fixed, or
for general input when km log m is larger than n log m by an order of magnitude, the new
algorithm is better than [LV-85a], [LV-85b].

In the recent survey on future directions for research in string matching [G-85], the
k-mismatches problem is discussed.  An open question which is proposed in the paper is
whether the simple (dynamic programming) algorithm for the k-mismatches problem which
takes O(mn) serial time can be improved.  A similar O(mn) time algorithm solves the
k-differences problem.  We note that the present paper answers this question affirmatively
also for the k-differences problem, which seems more general.

The serial algorithm is given in Section 2.  In order to make the presentation more
intuitive, Part III of the parallel algorithm is described in Section 3, Part II in Section 4 and
Part I in Section 5.


2.   The  Serial  Algorithm 

In this section we give our new serial algorithm for the k-differences problem.  As a
warm up we start with two known serial O(mn) time algorithms for this problem.  The first
one is a simple dynamic programming algorithm.  The second algorithm follows the same
dynamic programming computation in a slightly different way, which will help in understanding
the new algorithm.  Subsection 2.3 gives the new serial algorithm.

2.1.   The dynamic programming algorithm.

We use a matrix D[0,...,m; 0,...,n], where D_{i,l} is the minimum number of differences
between a_1,...,a_i and any successive substring of the text ending at t_l.

It should be obvious that if D_{m,l} ≤ k then there must be an occurrence of the pattern in the
text with at most k differences that ends at t_l.

The following algorithm computes the matrix D[0,...,m; 0,...,n]

Initialization       for all l, 0 ≤ l ≤ n, D_{0,l} := 0
                     for all i, 1 ≤ i ≤ m, D_{i,0} := i

for i := 1 to m do
    for l := 1 to n do
        D_{i,l} := min (D_{i-1,l} + 1, D_{i,l-1} + 1,
                        D_{i-1,l-1} if a_i = t_l or D_{i-1,l-1} + 1 otherwise)

(D_{i,l} is the minimum of three numbers.  These three numbers are obtained from the
predecessors of D_{i,l} on its column, row and diagonal, respectively.)
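
The recurrence above can be sketched in Python as follows. This is only an illustration: the function name k_differences_dp and the choice to return 1-based end positions are assumptions made here, not part of the paper.

```python
def k_differences_dp(pattern, text, k):
    """Return the 1-based text positions l with D[m][l] <= k (illustrative sketch)."""
    m, n = len(pattern), len(text)
    # D[i][l]: minimum number of differences between pattern[:i] and any
    # substring of the text ending at position l.  Row 0 is all zeros.
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = i
    for i in range(1, m + 1):
        for l in range(1, n + 1):
            # Diagonal predecessor: free if the characters agree, else a mismatch.
            diagonal = D[i - 1][l - 1] + (0 if pattern[i - 1] == text[l - 1] else 1)
            D[i][l] = min(D[i - 1][l] + 1, D[i][l - 1] + 1, diagonal)
    return [l for l in range(1, n + 1) if D[m][l] <= k]
```

For the Example of Section 1, calling k_differences_dp("bxdyegh", "abcdefghi", 3) returns a list that includes 8, the eighth location of the text.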
Complexity.   The algorithm clearly runs in O(mn) time.

2.2.   An  alternative  dynamic  programming  computation. 

The description is to some extent reminiscent of [U-83].  It computes the matrix D of the
dynamic programming algorithm, using its diagonals.  A diagonal d of the matrix consists
of all D_{i,l}'s such that l - i = d.

For a number of differences e and a diagonal d, let L_{d,e} denote the largest row i such that
D_{i,l} = e and D_{i,l} is on diagonal d.  The definition of L_{d,e} clearly implies that e is the
minimum number of differences between a_1,...,a_{L_{d,e}} and any substring of the text ending
at t_{L_{d,e}+d}.  It also implies that a_{L_{d,e}+1} ≠ t_{L_{d,e}+d+1}.  For our k-differences
problem we need only the values of L_{d,e}'s, where e satisfies e ≤ k.

If one of the L_{d,e}'s equals m, for e ≤ k, it means that there is an occurrence of the pattern in
the text with at most k differences that ends at t_{d+m}.

We compute the L_{d,e}'s by induction on e.  Given d and e we show how to compute L_{d,e}
using its definition.  Suppose that for all x < e and all diagonals y, L_{y,x} was already
computed.  Suppose L_{d,e} should get the value i.  That is, i is the largest row such that
D_{i,l} = e, and D_{i,l} is on the diagonal d.  The algorithm of the previous subsection reveals
that D_{i,l} could have been assigned its value e using one (or more) of the following data:

(a) D_{i-1,l-1} (which is the predecessor of D_{i,l} on the diagonal d) is e-1 and a_i ≠ t_l.  Or,
D_{i,l-1} (the predecessor of D_{i,l} on row i which is also on the diagonal "below" d) is e-1.
Or, D_{i-1,l} (the predecessor of D_{i,l} on column l which is also on the diagonal "above" d) is
e-1.

(b) D_{i-1,l-1} is also e and a_i = t_l.

This implies that we can start from D_{i,l} and follow its predecessors on diagonal d by
possibility (b) till the first time possibility (a) occurs.

The following algorithm "inverses" this description in order to compute the L_{d,e}'s.  L_{d,e-1},
L_{d-1,e-1}, and L_{d+1,e-1} are used to initialize the variable row, which is then increased by
one at a time till it hits the correct value of L_{d,e}.
The following algorithm computes the L_{d,e}'s

1.  Initialization   for all d, 0 ≤ d ≤ n+1, L_{d,-1} := -1
                     for all d, -(k+1) ≤ d ≤ -1 do
                         L_{d,|d|-1} := |d|-1

2.  for e := 0 to k do
        for d := -e to n do
3.          row := max [(L_{d,e-1} + 1), (L_{d-1,e-1}), (L_{d+1,e-1} + 1)]
4.          while a_{row+1} = t_{row+1+d} do
                row := row + 1
5.          L_{d,e} := row
6.          if L_{d,e} = m then
                print *THERE IS AN OCCURRENCE ENDING AT t_{d+m}*


Remarks.  a) For every i,l, D_{i,l} - D_{i-1,l-1} is either zero or one.  b) The values of the matrix
D on diagonals d, such that d > n-m+k+1 or d < -k, are useless for the solution of the
k-differences problem.

Correctness  of  the  algorithm. 

Claim.  L_{d,e} gets its correct value.

Proof of claim.  By induction on e.

Let e = 0.  Consider the computation of L_{d,0}, (d ≥ 0).  Instruction 3 starts by initializing row
to 0.  Instructions 4 and 5 find that a_1,...,a_{L_{d,0}} is equal to t_{d+1},...,t_{d+L_{d,0}} and
a_{L_{d,0}+1} ≠ t_{d+L_{d,0}+1}.  Therefore L_{d,0} gets its correct value.

Let e = l.  Assume that all L_{d,l-1} are correct.  (The reader can easily check that L_{-l,l-1} and
L_{-l-1,l-1} get correct values in the Initialization - this should have actually been part of
establishing the base of the induction.)  Consider the computation of L_{d,e}, (d ≥ -e).
Following Instruction 3, row is the largest row on diagonal d such that D_{row,d+row} can get
value e by possibility (a).  Then Instruction 4 finds L_{d,e}.

Complexity.    We evaluate L_{d,e} for n+k+1 diagonals.  For each diagonal the variable
row can get at most m different values.  Therefore, the computation takes O(mn) time.
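
A minimal Python sketch of this diagonal computation is given below. Several details are assumptions added for the sketch and are not taken from the paper: the function name, the dictionary that stores the L values, the treatment of uninitialized boundary entries through the helper get, the explicit clamping of row to the matrix borders, and the restriction 0 ≤ k < m.

```python
def k_differences_diagonals(pattern, text, k):
    """Sorted 1-based end positions of occurrences with at most k differences."""
    m, n = len(pattern), len(text)
    if not 0 <= k < m:
        raise ValueError("this sketch assumes 0 <= k < m")
    L = {}  # L[(d, e)]: largest row on diagonal d reachable with e differences

    def get(d, e):
        # Boundary values (an assumption of this sketch).
        if (d, e) in L:
            return L[(d, e)]
        if d < 0 and e in (-d - 1, -d - 2):
            return e      # sentinel below the first row of a negative diagonal
        return -1         # nothing matched yet, or diagonal outside the matrix

    ends = set()
    for e in range(k + 1):
        for d in range(-e, n + 1):
            # Instruction 3: best start row inherited from e - 1 differences.
            row = max(get(d, e - 1) + 1, get(d - 1, e - 1), get(d + 1, e - 1) + 1)
            row = min(row, m, n - d)   # stay inside the matrix (added guard)
            # Instruction 4: slide down the diagonal while characters agree.
            while row < m and row + d < n and pattern[row] == text[row + d]:
                row += 1
            L[(d, e)] = row            # Instruction 5
            if row == m:               # Instruction 6
                ends.add(d + m)
    return sorted(ends)
```

The explicit bounds in the while loop and the clamping step keep row inside the matrix; the pseudocode leaves these border cases implicit.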
2.3.   The new algorithm

The  new  algorithm  has  two  steps: 

Step I.  Concatenate the text and the pattern to one string (t_1,...,t_n,a_1,...,a_m).
Compute the suffixes tree of this string.

A methodological remark.  Step I of the serial algorithm presented here combines Parts I
and II of the parallel algorithm that follows.  Step II corresponds to Part III.

Step  II.    Find  all  occurrences  of  the  pattern  in  the  text  with  at  most  k  differences. 

2.3.1.   Step  I. 

Let us define the suffixes tree of a string C = c_1,...,c_l.

1)  It is a tree in which all the edges of the tree are directed away from the root.  The out
degree of each node of the tree is either zero (if the node is a leaf) or ≥ 2.

2)  Each suffix C_i = c_{i+1},...,c_l of the string defines a leaf of the tree.  (The tree has l
leaves.)

3)  Let C_i and C_j be any two suffixes.  Suppose c_{i+1},...,c_{i+f} is their longest equal prefix.
That is, c_{i+1},...,c_{i+f} equals c_{j+1},...,c_{j+f} and c_{i+f+1} ≠ c_{j+f+1}.  Then,
c_{i+1},...,c_{i+f} defines an internal node (i.e., a node which is not a leaf) of the tree.  Let D
be a successive substring of the string C.  Let B be a proper prefix of D.  Suppose also that
both D and B define nodes of the tree.  Then there is an edge connecting the nodes of D
and B if there is no successive substring F of C such that the following three
conditions hold at once: F is a proper prefix of D, B is a proper prefix of F and F defines a
node of the tree.

4)  The substrings of two sibling edges (edges emanating from the same vertex of the tree)
cannot have identical (nonempty) prefixes.

Upon construction of the suffixes tree we require that for each node v of the tree a
successive substring c_{i+1},...,c_{i+f} which defines it will be stored as follows: START(v) := i
and END(v) := f.

Remark.  Up to isomorphism (of graphs) there is only one suffixes tree for a given string.

EXAMPLE.  Given the string abab$ the suffixes tree is:

[Figure: The suffixes tree]

START(A) = 1, END(A) = 2, START(B) = 3, END(B) = 1, START(C) = 5, END(C) = 0,
START(D) = 0, END(D) = 4, START(E) = 2, END(E) = 2, START(F) = 1, END(F) = 3,
START(G) = 3, END(G) = 1.

We compute the suffixes tree of the string t_1,...,t_n,a_1,...,a_m using the serial
algorithm of Weiner [W-73].  Actually, we replace each character of the text which does
not appear in the pattern by the same special character Y and compute the suffixes tree for
this string.

Complexity.  [W-73] computes the suffixes tree in O(n) time when the size of the alphabet
is fixed.  This is also the running time of Step I for a fixed size alphabet.  If the alphabet of
the pattern contains x letters then it is easy to adapt this algorithm of [W-73] and Step I to
run in time O(n log x).  In both cases the space requirement of Step I is O(n).  (The reader
is also referred to [CS-85], in which a lucid presentation of the algorithm of [W-73] is
given.)

2.3.2.    Step  II. 

The matrix D and the L_{d,e}'s are exactly as in the alternative dynamic programming
algorithm.  We use this alternative algorithm with a very substantial change.  Introducing
this change in the present step of the algorithm and enabling it by proper preparation in the
previous step is the main contribution of this paper in both the serial and parallel
algorithms.  The change is in Instruction 4, where instead of increasing variable row by one
at a time until it reaches L_{d,e}, we find L_{d,e} in O(1) time!

For a diagonal d, the situation following Instruction 3 is that we matched (with e
differences) a_1,...,a_row of the pattern with some substring of the text that ends at t_{row+d}.
We want to find the largest q for which a_{row+1},...,a_{row+q} equals t_{row+d+1},...,t_{row+d+q}.  Let
LCA_{row,d} be the lowest common ancestor (in short LCA) of the leaves of the suffixes
t_{row+d+1},... and a_{row+1},... in the suffixes tree.  The desired q is simply END(LCA_{row,d}).
Thus, the problem of finding this q is reduced to finding LCA_{row,d}.  We use the algorithm
of [HT-84] for the purpose of computing LCA's in the suffixes tree whenever we need to
find such a q throughout the algorithm.
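
The effect of this change can be illustrated by the following Python sketch, in which Instruction 4 becomes a single table lookup. Note the substitution: instead of the suffixes tree with the LCA algorithm of [HT-84], the sketch precomputes a plain longest-common-extension table in O(mn) time and space, so it reproduces the structure of Step II but not its complexity bounds. The names lce_table and k_differences_jump, the boundary handling and the restriction 0 ≤ k < m are assumptions of the sketch.

```python
def lce_table(pattern, text):
    """lce[i][j] = length of the longest common prefix of pattern[i:] and text[j:]."""
    m, n = len(pattern), len(text)
    lce = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if pattern[i] == text[j]:
                lce[i][j] = lce[i + 1][j + 1] + 1
    return lce


def k_differences_jump(pattern, text, k):
    """Diagonal algorithm in which each extension is one O(1) lookup."""
    m, n = len(pattern), len(text)
    if not 0 <= k < m:
        raise ValueError("this sketch assumes 0 <= k < m")
    lce = lce_table(pattern, text)  # stand-in for suffixes tree + LCA queries
    L = {}

    def get(d, e):
        # Boundary values (an assumption of this sketch).
        if (d, e) in L:
            return L[(d, e)]
        if d < 0 and e in (-d - 1, -d - 2):
            return e
        return -1

    ends = set()
    for e in range(k + 1):
        for d in range(-e, n + 1):
            row = max(get(d, e - 1) + 1, get(d - 1, e - 1), get(d + 1, e - 1) + 1)
            row = min(row, m, n - d)          # stay inside the matrix
            row += lce[row][row + d]          # the jump: q found by one lookup
            L[(d, e)] = row
            if row == m:
                ends.add(d + m)
    return sorted(ends)
```

Each of the O(kn) pairs (d, e) now costs constant time after preprocessing, which is the source of the O(kn) bound once the table is replaced by constant-time LCA queries.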

Complexity.  Using the classification of [HT-84] we are interested in the static lowest
common ancestors problem, where the tree is static but queries for lowest common
ancestors of pairs of vertices are given on line.  That is, each query must be answered before
the next one is known.  The suffixes tree has O(n) nodes.  The algorithm of [HT-84]
proceeds as follows.  It preprocesses the suffixes tree in O(n) time.  Then, given α ≥ n LCA
queries it responds to them in a total of O(α) time.  For each of the n+k+1 diagonals, we
evaluate k+1 L_{d,e}'s.  Therefore, we have O(kn) LCA queries.  It will take O(kn) time to
process them.  This time dominates the running time of Step II.

Complexity of the serial algorithm.  The total time for the serial algorithm is O(kn)
time for an alphabet whose size is fixed and O(n(k + log m)) time for general input.

3.  Parallel  Algorithm  -  Finding  All  Occurrences  Of  The  Pattern  In 
The  Text  With  At  Most  k  Differences  (Part  III). 

This section is devoted to the last part of the parallel algorithm.  The presentation will
clarify which information became available as a result of Parts I and II.  The matrix D and
the L_{d,e}'s are exactly as in the serial algorithm.  Part III of the parallel algorithm employs
n+k+1 processors.  Each processor is assigned to a diagonal d, -k ≤ d ≤ n.  The parallel
treatment of the diagonals is the source of parallelism in Part III of our new algorithm.

For a diagonal d the situation following Instruction 3 is that we matched (with e
differences) a_1,...,a_row of the pattern with some substring of the text that ends at t_{row+d}.
We want to find the largest q for which a_{row+1},...,a_{row+q} equals t_{row+d+1},...,t_{row+d+q}.
In the serial algorithm we got this q from the suffixes tree.  In the parallel algorithm we
get q in a different way.  We use two kinds of information from the previous parts of the
algorithm:

a)  An index g of the pattern which brings l to a maximum in the following match:
t_{row+d+1},...,t_{row+d+l} = a_{g+1},...,a_{g+l}.  (There is no larger l (and g) for which such a
match holds.)  This information was computed in Part II into an array called BEST-FIT
(see section 4).

b)  The length f of the longest match between a_{row+1},... and a_{g+1},....  That is,
a_{row+1},...,a_{row+f} = a_{g+1},...,a_{g+f} and a_{row+f+1} ≠ a_{g+f+1}.  This information was
computed in Part I into array MAX-LENGTH (see section 5).

Observation.  The desired q is the minimum between l and f.  Proof of the observation is
straightforward.
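
The Observation can be checked on small inputs with the following Python sketch. It computes the two pieces of information naively by direct comparison rather than by the tables of Parts I and II; the helper names common_prefix_len, best_fit and extension are hypothetical, and indices are 0-based offsets (so row and row + d play the roles they have in the text).

```python
def common_prefix_len(x, y):
    """Length of the longest common prefix of the strings x and y."""
    length = 0
    while length < len(x) and length < len(y) and x[length] == y[length]:
        length += 1
    return length


def best_fit(pattern, text, i):
    """(g, l): the longest prefix of text[i:] that occurs in the pattern, at offset g."""
    best = (0, 0)
    for g in range(len(pattern)):
        l = common_prefix_len(text[i:], pattern[g:])
        if l > best[1]:
            best = (g, l)
    return best


def extension(pattern, text, row, d):
    """The desired q, obtained as min(l, f) without comparing pattern to text directly."""
    g, l = best_fit(pattern, text, row + d)               # information (a)
    f = common_prefix_len(pattern[row:], pattern[g:])     # information (b)
    return min(l, f)
```

The value returned by extension agrees with the direct comparison common_prefix_len(pattern[row:], text[row + d:]): if f < l the pattern suffix departs from a_{g+1},... (and hence from the text) after f characters, and otherwise the match of length l carries over and cannot be exceeded because l is maximal.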

We  use  the  parameter  d  and  the  pardo  command  for  the  purpose  of  guiding  each 
processor  to  its  instruction. 

Part  III  of  the  parallel  algorithm 

1.  Initialization (as above)

2.  for e := 0 to k do
        for d := -e to n pardo
3.          row := max [(L_{d,e-1} + 1), (L_{d-1,e-1}), (L_{d+1,e-1} + 1)]
4.          L_{d,e} := row + min(f, l)
5.          if L_{d,e} = m then
                print *THERE IS AN OCCURRENCE ENDING AT t_{d+m}*

Complexity of Part III.  We employ n+k+1 processors (one per diagonal).  Each
processor computes at most k+1 L_{d,e}'s.  Obtaining the information from Parts I and II to
compute each L_{d,e} takes O(1) time.  Therefore, Part III takes O(k) time using n+k+1
processors.  Simulating the algorithm by n processors, instead of n+k+1, still gives O(k)
time.
4.   Parallel  Algorithm  -  Computation  Of  Array  Best-Fit  (Part  II).

In this section we compute the one dimensional array BEST-FIT[0,...,n-1].
BEST-FIT(i) = (g,l) means that t_{i+1},...,t_{i+l} = a_{g+1},...,a_{g+l}, and there is no larger l,
such that there exists g, for which such a match holds.  In this case we denote
BEST-FIT(i).1 = g and BEST-FIT(i).2 = l.  The computation relies on the following
information which was gathered in Part I of the parallel algorithm (see section 5):

1.  The array LOCATION[1,...,σ] (σ being the size of the alphabet).  LOCATION(i) = f means
that character i in the alphabet appears in location f of the pattern (a_f = i).

2.  The three dimensional array PAIR-FIT[0,...,m-1; 0,...,m-1; 0,...,⌈log m⌉].
PAIR-FIT(j,l,i) = (x,f) means that a_{j+1},...,a_{j+2^i}, a_{l+1},...,a_{l+f} = a_{x+1},...,a_{x+2^i+f} and
there is no larger f, such that there exists x, for which such a match holds.

Step   1.  Using  array  LOCATION  find  for  each  character  in  the  text  a  location  in  the 
pattern  in  which   the  same  character  appears  (if  there  is  one). 

Note that Step 1 results in a characterization of the text using characters of the pattern
only.  Each stage of Steps 2 and 3 refines this characterization until the ultimate
characterization is reached and entered into BEST-FIT.
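
Step 1 can be sketched serially in Python as below. The dictionary standing in for the array LOCATION, the function names, the choice of the first location when a character occurs several times, and the use of None for text characters absent from the pattern are all assumptions of the sketch; in the parallel algorithm each text character would be handled by its own processor.

```python
def build_location(pattern):
    """Map each character of the pattern to one 1-based location where it appears."""
    location = {}
    for f, ch in enumerate(pattern, start=1):
        location.setdefault(ch, f)   # keep the first location (any one would do)
    return location


def characterize_text(pattern, text):
    """For each text character, a pattern location holding it, or None if there is none."""
    location = build_location(pattern)
    return [location.get(ch) for ch in text]
```

For the pattern bxdyegh and the text abcdefghi this gives [None, 1, None, 3, 5, None, 6, 7, None].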

Steps 2 and 3 use the scheme of parallel prefix sum computation in which a balanced binary
tree guides the computation.  We refer the reader to [V-84] for a detailed description.  (For
an earlier reference to using the scheme of parallel prefix sum computation see [FL-80].)
The balanced binary tree is defined as follows: Each pair [i,j], where 0 ≤ i ≤ ⌈log n⌉,
0 ≤ j ≤ n-1 and j is divisible by 2^i, defines a node of the binary tree whose left son is
[i-1,j] and right son is [i-1,j+2^{i-1}].
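
As an illustration of this indexing, the following Python sketch enumerates the nodes [i,j] and the sons of a node. The function names are hypothetical, and dropping a right son whose index would exceed n-1 (which can happen when n is not a power of two) is an assumption of the sketch.

```python
def tree_nodes(n):
    """All pairs (i, j) with 0 <= i <= ceil(log2 n), 0 <= j <= n-1, j divisible by 2**i."""
    height = (n - 1).bit_length()   # equals ceil(log2 n) for n >= 1
    return [(i, j) for i in range(height + 1) for j in range(0, n, 2 ** i)]


def sons(i, j, n):
    """Left son [i-1, j] and right son [i-1, j + 2**(i-1)] of node [i, j]; leaves have none."""
    if i == 0:
        return []
    candidates = [(i - 1, j), (i - 1, j + 2 ** (i - 1))]
    return [(level, index) for level, index in candidates if index <= n - 1]
```

With n = 4 the tree has four leaves (0,0),...,(0,3), two nodes at level 1 and the root (2,0).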

Step 2.  The computation proceeds in ⌈log m⌉ stages.

The output of stage i, 1 ≤ i ≤ ⌈log m⌉: Essentially, it is the same as the input of stage i+1.
For each j, 0 ≤ j ≤ n-1, which is divisible by 2^i, we are given an index f of the pattern
such that t_{j+1},...,t_{j+2^i} = a_{f+1},...,a_{f+2^i}, if such f exists.  If such f does not exist we
are given an index g of the pattern which brings l to a maximum in the following match:
t_{j+1},...,t_{j+l} = a_{g+1},...,a_{g+l}.

The relation to the binary tree is clear: The "active" nodes at stage i are of the form [i,j].
The input that such a node needs can be obtained from its two sons.  We further observe
(and leave it to the reader to fill in the details) that using PAIR-FIT a single processor can
perform the operations corresponding to each node in O(1) time.  We refer to the
computation at each node as an operation.

Observe that the computation starts at the leaves and moves towards the root of the
balanced binary tree which guides the computation until it reaches the nodes of the tree
whose distance from the leaves is ⌈log m⌉.  The tree has n leaves and therefore O(n)
nodes.  The ⌈log m⌉ stages use only the ⌈log m⌉ lower levels of the tree.  Hence, this
description implies a total of O(n) operations and O(log m) time.

Step 3 also consists of ⌈log m⌉ stages.  Essentially, they amount to reversing the ⌈log m⌉
stages of Step 2.  That is, the computation of Step 3 starts at nodes of the tree whose
distance from the leaves is ⌈log m⌉ and ends at the leaves.

Step 3. The computation proceeds in $\lceil \log m \rceil$ stages. We describe stage
$i$, $1 \le i \le \lceil \log m \rceil$. Let $\delta := \lceil \log m \rceil - i$.

The input which is relevant to stage i:

a) For each $j$, $0 \le j \le n-1$, which is divisible by $2^{\delta+1}$, we are given an index $g$ of
the pattern which brings $l$ to a maximum in the following match:
$t_{j+1}, \ldots, t_{j+l} = a_{g+1}, \ldots, a_{g+l}$. (That is, we have already found that
BEST-FIT(j) = (g,l).)

b) For each $h$, $0 \le h \le n-1$, which is divisible by $2^{\delta}$ but is not divisible by
$2^{\delta+1}$, we are given an index $f$ of the pattern such that
$t_{h+1}, \ldots, t_{h+2^{\delta}} = a_{f+1}, \ldots, a_{f+2^{\delta}}$, if such $f$ exists. If such $f$ does
not exist we are given an index $g$ of the pattern which brings $l$ to a maximum in the
following match: $t_{h+1}, \ldots, t_{h+l} = a_{g+1}, \ldots, a_{g+l}$ (observe that in this case (g,l)
is already the desired output for BEST-FIT(h)).

The result of stage i:

For each $h$, $0 \le h \le n-1$, which is divisible by $2^{\delta}$ but is not divisible by
$2^{\delta+1}$, we find BEST-FIT(h). That is, we want to find an index $g$ of the pattern which
brings $l$ to a maximum in the following match:
$t_{h+1}, \ldots, t_{h+l} = a_{g+1}, \ldots, a_{g+l}$. This computation takes place at node
$[\delta, h]$ of the tree. Finally, we observe that using PAIR-FIT a single processor can carry
out the computation required at each node in O(1) time.

Similar to Step 2, these $\lceil \log m \rceil$ stages require a total of O(n) operations and
O(log m) time.
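
To make the quantity computed in these stages concrete, the following is a minimal brute-force sketch of BEST-FIT as it is characterized in items a) and b) above: for a text position j it returns a pattern index g and the largest l such that l text characters starting after position j agree with l pattern characters starting after position g. The function name `best_fit`, the 0-based indexing and the tie-breaking rule (the smallest g wins) are assumptions made for illustration; this is not the O(1)-per-node procedure based on PAIR-FIT.

```python
def best_fit(text, pattern, j):
    """Brute-force BEST-FIT(j): return (g, l) with l maximal such that
    text[j:j+l] == pattern[g:g+l] (0-based; ties go to the smallest g)."""
    best = (0, 0)
    for g in range(len(pattern)):
        l = 0
        # Extend the match character by character.
        while (j + l < len(text) and g + l < len(pattern)
               and text[j + l] == pattern[g + l]):
            l += 1
        if l > best[1]:
            best = (g, l)
    return best
```

For example, with text "xabcy" and pattern "zabc", position 1 of the text fits the pattern at index 1 with length 3.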

Complexity of Part II. We had a total of O(n) operations and O(log m) time. No difficulty
will arise in applying the simulation scheme due to [Br-74]. Applications of this scheme for
similar purposes were given in a few parallel algorithms. For instance, see [V-85b]. This
will result in O(log m) time using n/log m processors.
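
The processor count can be checked against the usual form of the scheduling bound behind such simulation schemes. The symbols W, T and p below are introduced only for this sketch and are not the paper's notation, and processor allocation is assumed to cause no extra cost, as stated above.

```latex
% W = total number of operations, T = parallel time with unboundedly many
% processors, p = number of processors actually available.
T_p \;\le\; \left\lceil \frac{W}{p} \right\rceil + T .
% With W = O(n), T = O(\log m) and p = n/\log m:
T_p \;=\; O\!\left(\frac{n}{\,n/\log m\,}\right) + O(\log m) \;=\; O(\log m) .
```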

5.   Parallel  Algorithm  -  Analysis  Of  The  Pattern  (Part  I). 

5.1.  Computation  Of  Array  Max-Length 

The input is the pattern $a_1, \ldots, a_m$. The output is the two dimensional array
MAX-LENGTH[0,...,m-1; 0,...,m-1]. MAX-LENGTH(i,j) = f means that
$a_{i+1}, \ldots, a_{i+f} = a_{j+1}, \ldots, a_{j+f}$ and $a_{i+f+1} \ne a_{j+f+1}$. In words, consider laying
the suffix of the pattern starting at $a_{i+1}$ over the suffix of the pattern starting at
$a_{j+1}$. MAX-LENGTH(i,j) is the longest match of prefixes between these two suffixes.

The pair $(i,j)$, $0 \le i,j \le m-1$, is defined to be on diagonal d of MAX-LENGTH if
$j - i = d$, where possible values of d are $-(m-1) \le d \le m-1$. It is easy to see that
MAX-LENGTH(i,j) = MAX-LENGTH(i+1,j+1) + 1 if $a_{i+1} = a_{j+1}$ and
MAX-LENGTH(i,j) := 0 otherwise. We compute MAX-LENGTH in two steps:

a) Initialization: for each pair $(i,j)$, $0 \le i,j \le m-1$, MAX-LENGTH(i,j) := 1 if
$a_{i+1} = a_{j+1}$ and MAX-LENGTH(i,j) := 0 otherwise.

b) Using a parallel prefix sum computation we compute the values of each diagonal d of
MAX-LENGTH separately. (We compute the sum till the first 0 and not till the bottom of
the diagonal.)

Complexity. In step a we have $O(m^2)$ operations and O(1) time. In step b we have a total
of O(m) operations and O(log m) time per diagonal. Applying the simulation scheme due to
[Br-74] will result in O(log m) time using $m^2/\log m$ processors.
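
The recurrence for MAX-LENGTH can be illustrated by a short serial sketch. It fills each diagonal from its bottom upwards instead of using the parallel prefix computation of step b, so it shows the values being computed rather than the parallel method. The function name `max_length` and the 0-based indexing (entry (i,j) compares the suffixes starting at `a[i]` and `a[j]`) are assumptions of the sketch.

```python
def max_length(a):
    """Serial O(m^2) computation of MAX-LENGTH for pattern a (0-based).

    ml[i][j] is the length of the longest common prefix of the suffixes
    a[i:] and a[j:].
    """
    m = len(a)
    # One extra row and column of zeros acts as the value "below the
    # bottom" of every diagonal.
    ml = [[0] * (m + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == a[j]:
                ml[i][j] = ml[i + 1][j + 1] + 1
    return [row[:m] for row in ml[:m]]
```

For the pattern "abab", the suffixes "abab" and "ab" agree on two characters, so entry (0,2) is 2.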

5.2.  Computation  Of  Array  Pair-Fit 

The input is the pattern $a_1, \ldots, a_m$. The output is the three dimensional array
PAIR-FIT[0,...,m-1; 0,...,m-1; 0,...,$\lceil \log m \rceil$].

PAIR-FIT(j,l,i) = $(\lambda, f)$ means that
$a_{j+1}, \ldots, a_{j+2^i}, a_{l+1}, \ldots, a_{l+f-2^i} = a_{\lambda+1}, \ldots, a_{\lambda+f}$ and
there is no larger $f$ (and $\lambda$) for which such a match holds. In words, concatenate to
$a_{j+1}, \ldots, a_{j+2^i}$ the suffix of the pattern starting at $a_{l+1}$. As a result we get the
concatenated string $a_{j+1}, \ldots, a_{j+2^i}, a_{l+1}, \ldots, a_m$. We are interested in a suffix of the
pattern whose prefix has a longest match with a prefix of the concatenated string, and
$a_{\lambda+1}, \ldots$ is such a suffix.
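
The definition of PAIR-FIT can be made concrete with a brute-force sketch that builds the concatenated string explicitly and tries every suffix of the pattern. It is meant only to pin down what a table entry contains. The function name `pair_fit`, the 0-based indexing and the tie-breaking rule (the smallest index wins among equally long matches) are assumptions; the paper's computation uses the suffixes tree instead.

```python
def pair_fit(a, j, l, i):
    """Brute-force PAIR-FIT(j, l, i) for pattern a (0-based).

    Returns (lam, f): the suffix a[lam:] shares a prefix of maximal length f
    with the string a[j:j + 2**i] + a[l:].
    """
    concat = a[j:j + 2 ** i] + a[l:]
    best = (0, 0)
    for lam in range(len(a)):
        f = 0
        while (lam + f < len(a) and f < len(concat)
               and a[lam + f] == concat[f]):
            f += 1
        if f > best[1]:
            best = (lam, f)
    return best
```

For the pattern "abcab", joining the length-2 block starting at index 3 ("ab") with the suffix starting at index 2 ("cab") gives "abcab", which is matched in full by the suffix at index 0.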

Step 1. Construction of the suffixes tree.

The suffixes tree as it is described in section 2 has m leaves and therefore it has fewer
than m internal nodes. However, the number of pairs of suffixes is $\Omega(m^2)$, which is much
larger. Step 1.1 results in a set of at most m-1 pairs which provide all internal nodes (as
implied by Observation (1) below). Observations (2) and (3) enable us to use the same
computation to find two things: (a) a minimal set of pairs which provides all internal nodes,
and (b) for each node, its father in the tree.

Step 1.1. Sort the m suffixes of the pattern. (A comparison between two suffixes is
performed in O(1) time as follows. MAX-LENGTH gives the longest match of prefixes
between the two suffixes. This implies the index of the leftmost character in which the two
suffixes differ. So, compare the characters that have this index in both suffixes.) Any
parallel sorting algorithm which is based on comparisons can be used here. We assume, of
course, some total order on the characters of the pattern: in the computer, each character
is given by a binary representation. We can use the natural order on these binary
representations.

Let $B_0, B_1, \ldots, B_{m-1}$ be the sorted vector of suffixes which is obtained in Step 1.1. Let
$f_i$ be the length of the longest equal prefix of $B_i$ and $B_{i+1}$ ($0 \le i < m-1$). Note that
the $f_i$ values can be looked up in MAX-LENGTH. Step 1.2 below computes the suffixes tree
from the $f_i$'s using the following observations.
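
Step 1.1 and the definition of the $f_i$ values can be sketched serially as follows. Python's built-in sort stands in for the parallel comparison-based sort, and each comparison is a constant-time table lookup followed by one character comparison. The names `sort_suffixes` and `max_length` and the 0-based indexing are assumptions of the sketch, as is the rule that a suffix which is a proper prefix of another one is ordered first (the text above does not say how that case is handled).

```python
from functools import cmp_to_key

def max_length(a):
    """ml[i][j] = length of the longest common prefix of a[i:] and a[j:]."""
    m = len(a)
    ml = [[0] * (m + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == a[j]:
                ml[i][j] = ml[i + 1][j + 1] + 1
    return ml

def sort_suffixes(a):
    """Return (order, f): order[t] is the start of the t-th smallest suffix,
    f[t] is the longest common prefix length of suffixes order[t], order[t+1]."""
    m, ml = len(a), max_length(a)

    def compare(i, j):
        if i == j:
            return 0
        k = ml[i][j]          # first position where the two suffixes can differ
        if i + k == m:        # suffix i is a proper prefix of suffix j
            return -1
        if j + k == m:        # suffix j is a proper prefix of suffix i
            return 1
        return -1 if a[i + k] < a[j + k] else 1

    order = sorted(range(m), key=cmp_to_key(compare))
    f = [ml[order[t]][order[t + 1]] for t in range(m - 1)]
    return order, f
```

For the pattern "banana" this yields the suffix order a, ana, anana, banana, na, nana and the values f = [1, 3, 0, 0, 2].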

Observations. (1) Let v be an internal node of the suffixes tree. Suppose v is defined by
the longest equal prefix of two suffixes $B_i$ and $B_j$. Then, there exists $y$, $0 \le y < m-1$,
such that the longest equal prefix of $B_y$ and $B_{y+1}$ also defines v. (For short, we then say
that v is the node of $f_y$.) Observation (1) says that every node of the tree is a node of some
$f_i$. This implies that it is sufficient to consider the nodes of the $f_i$'s only for computing
the suffixes tree. Observation (2) characterizes the situation where the nodes of several
$f_i$'s are the same.

(2) Suppose that for some $0 \le i < j < m-1$, $f_i = f_j$ and for all $i < y < j$, $f_y \ge f_i$.
Then the nodes of $f_i$ and $f_j$ are the same. This implies that we can dispense with $f_j$ in
the minimal set of $f_i$ values which define internal nodes. We identify the internal nodes of
the suffixes tree with the $f_y$ values that have not been dispensed with.

(3) Let $0 \le h < i < j < m-1$ and suppose $f_i > f_h$, $f_i > f_j$ and for all $h < y < j$,
$f_i \le f_y$. Then, the node of the maximal among $f_h$ and $f_j$ is the father of the node of
$f_i$. (Note that if $f_h = f_j$, then, by Observation (2), their nodes are the same.)

Step 1.2. Consider the interval of integers [0,1,...,m-2]. We define the singleton
subintervals [0],[1],...,[m-2] to be the subintervals of level 0. The $\lceil m/2 \rceil$ size-two
subintervals [0,1],[2,3],[4,5],... are said to be the subintervals of level 1. The $m/2^l$
size-$2^l$ subintervals $[0, \ldots, 2^l - 1], [2^l, \ldots, 2 \cdot 2^l - 1], \ldots$ are said to be the subintervals
of level $l$. All these O(m) subintervals are called the power-two subintervals of [0,...,m-2].

1.2.1. For each power-two subinterval find the minimum $f_i$ over the subinterval. We
implement this computation using a balanced binary tree in the obvious way. This takes
O(log m) time using m/log m processors.

1.2.2. For each $0 \le i < m-1$, find the largest $h < i$ such that $f_h \le f_i$ (if such h exists).
Denote this h by h(i). We implement this computation by assigning a processor to each
such i. The processor finds h(i) by a kind of binary search on the power-two subintervals
in time O(log m).
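
Steps 1.2.1 and 1.2.2 can be sketched as follows. The text only says "a kind of binary search", so the particular climb-then-descend search below is one plausible reading and should be taken as an assumption, as should the names `build_levels` and `largest_left_le`. `levels[l][k]` holds the minimum over the k-th subinterval of level l, and the last subinterval of a level may be shorter than $2^l$.

```python
def build_levels(f):
    """levels[l][k] = min of f over the k-th power-two subinterval of level l."""
    levels = [list(f)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([min(prev[k:k + 2]) for k in range(0, len(prev), 2)])
    return levels

def largest_left_le(levels, i):
    """Return the largest h < i with f[h] <= f[i], or None if there is none."""
    target = levels[0][i]
    l, k = 0, i
    while k > 0:
        # A left sibling lies entirely to the left of i; test its minimum.
        if k % 2 == 1 and levels[l][k - 1] <= target:
            k -= 1
            # Descend, preferring the right half so that h is as large as possible.
            while l > 0:
                l -= 1
                k = 2 * k + 1 if levels[l][2 * k + 1] <= target else 2 * k
            return k
        l, k = l + 1, k // 2      # climb to the enclosing subinterval
    return None
```

Each call inspects O(log m) subintervals, which matches the time bound stated for one processor per index.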

If $f_i > f_{h(i)}$ or if h(i) does not exist, we conclude that $f_i$ belongs to the minimal set of
$f_i$ values and identify $f_i$ with a node of the suffixes tree.

If $f_i = f_{h(i)}$, we conclude that $f_i$ does not belong to this minimal set. We say that $f_i$ is
a non-node. We compute the node $f_{\alpha(i)}$, which is the node defined by non-node $f_i$, as
follows.

1.2.3. For each non-node $f_i$ we assign a processor which performs a binary search on the
power-two subintervals to find the smallest $\alpha < i$ such that $f_i = f_\alpha$ and for all
$\alpha < y < i$, $f_y \ge f_i$.

1.2.4. For each $0 \le i < m-1$, find the smallest $j > i$ such that $f_j < f_i$ (if such j exists).
Denote this j by j(i). We implement this computation in the same way as in 1.2.2 above.
So, using m-1 processors it takes O(log m) time.

1.2.5. For each node of the tree, find its father.

For each internal node $f_i$, take the maximum value between $f_{h(i)}$ and $f_{j(i)}$. Let $\beta(i)$ be
h(i) if $f_{h(i)}$ provides the maximum and let it be j(i) otherwise. By Observation (3), the node
of $f_{\beta(i)}$ is the father of node $f_i$ in the suffixes tree. Observe that $f_{\beta(i)}$ may be a non-node.


However, in 1.2.3 above we found the node $f_{\alpha(\beta(i))}$, which is the node defined by
$f_{\beta(i)}$. So, using a processor per node we can find its father in O(1) time.

Each suffix of the pattern is a leaf of the tree. Consider suffix $B_i$ in the output of 1.1.
Using the notations of 1.2.5 for internal nodes, the role of $f_{h(i)}$ is played by $f_{i-1}$ and
the role of $f_{j(i)}$ by $f_i$. Finding the father of $B_i$ is now similar to finding the father of
an internal node.
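
The following serial sketch puts 1.2.2 through 1.2.5 together: it computes h(i), j(i), the node that each $f_i$ stands for, and the father of every internal node and of every leaf. A stack-based scan replaces the search on power-two subintervals, so this illustrates the definitions rather than the parallel procedure. The names `tree_fathers`, `rep`, `internal` and `leaves` are assumptions; internal nodes are identified by the index i of their $f_i$, and `None` marks a node for which neither h(i) nor j(i) exists.

```python
def tree_fathers(f):
    """f[i] = longest common prefix of sorted suffixes B_i and B_(i+1).

    Returns (internal, leaves): internal maps each node index i to the index
    of its father node (or None); leaves[i] is the father node of leaf B_i.
    """
    n = len(f)
    h, stack = [None] * n, []
    for i in range(n):                     # h[i]: largest h < i with f[h] <= f[i]
        while stack and f[stack[-1]] > f[i]:
            stack.pop()
        h[i] = stack[-1] if stack else None
        stack.append(i)
    j, stack = [None] * n, []
    for i in range(n - 1, -1, -1):         # j[i]: smallest j > i with f[j] < f[i]
        while stack and f[stack[-1]] >= f[i]:
            stack.pop()
        j[i] = stack[-1] if stack else None
        stack.append(i)
    rep = list(range(n))                   # rep[i]: node defined by f_i (alpha(i))
    for i in range(n):
        if h[i] is not None and f[h[i]] == f[i]:
            rep[i] = rep[h[i]]             # f_i is a non-node

    def father(left, right):
        cands = [c for c in (left, right) if c is not None]
        if not cands:
            return None
        return rep[max(cands, key=lambda c: f[c])]   # left wins ties

    internal = {i: father(h[i], j[i]) for i in range(n) if rep[i] == i}
    leaves = [father(i - 1 if i > 0 else None, i if i < n else None)
              for i in range(n + 1)]
    return internal, leaves
```

For f = [1, 3, 0, 0, 2] (the sorted suffixes of "banana"), the values $f_2$ and $f_3$ define the same node, which has no father, and the nodes of $f_0$ and $f_4$ are its children.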

Below we use the Euler tour technique of [TV-85] and [V-85a], which computes various
functions on trees. The description below has been taken from [V-85a].

Step 2. Find an Euler path in the suffixes tree that starts and ends at the root (say r) and
compute for each node in the tree the length of the (shortest) path leading from the node to
the root, to be called the level of the node. The input to Step 2 is T, the suffixes tree which
was computed in Step 1.

2.1 Finding an Euler path.

2.1.1 Replace each edge (u,v) of T by two anti-parallel directed edges $u \to v$ and $v \to u$ to
form a digraph called $T'$.

2.1.2 For each node v of T we do the following. (Let degree(v) = d in T and let the d
adjacent edges of v in T be $(v,u_1), \ldots, (v,u_d)$.)

$D(u_i \to v) := v \to u_{(i \bmod d)+1}$ for $1 \le i \le d$. Now D has an Euler cycle.

The "correction" $D(u_d \to r) :=$ 'end of list' (where d = degree(r)) gives an Euler path
that starts and ends at r.

2.2 Finding for each node its level in T.

2.2.1 The distance of each edge of $T'$ from the end of the path is computed into a
vector R using a 'doubling' procedure.

Initialize: R(e) := -1 for all edges e in $T'$ which are directed towards the root and
R(e) := 1 for all edges e in $T'$ which are directed away from the root. Also
R('end of list') := 0.

Apply $\lceil \log_2 (m-1) \rceil$ iterations in parallel ((m-1) is the length of the Euler
path):

R(e) := R(e) + R(D(e)), D(e) := D(D(e)).

2.2.2 The doubling procedure assigns to R(e), for each edge e = $(u \to v)$ which is directed
towards the root in $T'$, the level of u in T.
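
Step 2 can be simulated serially as a sketch: build the successor function D, cut the Euler cycle at the root, and run the doubling iterations on synchronous copies of R and D. The sketch makes several assumptions that are not taken from the text: the tree is given as a dictionary `father` mapping each node to its father (the root maps to `None`); the number of iterations is chosen from the actual number of directed edges; and the weight of an edge directed towards the root is taken to be +1 and that of an edge directed away from it -1, since with that sign choice the sum accumulated for an upward edge $u \to v$ comes out as the (positive) level of u.

```python
def euler_levels(father):
    """Levels of all nodes via an Euler path and pointer doubling."""
    adj = {v: [] for v in father}
    root = None
    for v, p in father.items():
        if p is None:
            root = v
        else:
            adj[v].append(p)
            adj[p].append(v)
    if not adj[root]:
        return {root: 0}
    # D(u -> v) = v -> (neighbour of v that follows u, cyclically).
    D = {}
    for v, nbrs in adj.items():
        for t, u in enumerate(nbrs):
            D[(u, v)] = (v, nbrs[(t + 1) % len(nbrs)])
    D[(adj[root][-1], root)] = None            # the "correction": end of list
    # +1 for edges towards the root, -1 for edges away from it (assumed signs).
    R = {e: (1 if father[e[0]] == e[1] else -1) for e in D}
    for _ in range(len(D).bit_length()):       # enough rounds to cover the path
        new_R, new_D = {}, {}
        for e in D:                            # conceptually all edges in parallel
            if D[e] is None:
                new_R[e], new_D[e] = R[e], None
            else:
                new_R[e], new_D[e] = R[e] + R[D[e]], D[D[e]]
        R, D = new_R, new_D
    level = {root: 0}
    for (u, v), r in R.items():
        if father[u] == v:                     # upward edge u -> v carries level(u)
            level[u] = r
    return level
```

On a tree with root r, children a and b of r, and a child c of a, the computed levels are 0, 1, 1 and 2 respectively.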

Step 3. We show how to find for each suffix of the pattern $A_j$ and each integer
$i$, $0 \le i \le \lceil \log m \rceil$, all values of PAIR-FIT(j,l,i). So, below, j and i are fixed.
Assign a processor to each suffix $A_\lambda$. Using MAX-LENGTH the processor checks whether
the prefix of length $2^i$ of $A_j$ is a prefix of $A_\lambda$. If yes, "mark" (the leaf of) suffix
$A_{\lambda+2^i}$. As a result some of the leaves of the tree are marked. Recall that our goal is to
compute into PAIR-FIT(j,l,i) (for each $l$, $0 \le l \le m-1$) the values $\lambda$ and $f$ such that
$a_{j+1}, \ldots, a_{j+2^i}, a_{l+1}, \ldots, a_{l+f-2^i} = a_{\lambda+1}, \ldots, a_{\lambda+f}$ and there is no larger $f$
(and $\lambda$) for which such a match holds. If $l$ is a marked leaf then we simply select
$\lambda := l - 2^i$ and $f := m - \lambda$. Otherwise, to maximize $f$ we have to find a marked leaf such
that the level of LCA(l, the marked leaf) is as large as possible.

Observations. Suppose leaf $l$ is not marked. Consider the Euler path on T and the location
of leaf $l$ in this path. (The incoming edge of $l$ is followed by the outgoing edge of $l$. We
say that the location of leaf $l$ is "between" these two edges.) Let $\alpha$ be the last marked
leaf which precedes $l$ in the Euler path and let $\beta$ be the first marked leaf which succeeds
$l$ in the Euler path. Then,

(1) either $\alpha$ or $\beta$ (or both) can provide the desired $\lambda$.

(2) the portion of the Euler path leading from $\alpha$ to $l$ visits LCA($\alpha$,$l$); moreover, the
node LCA($\alpha$,$l$) provides the minimal level in this path. Similarly, the portion of the Euler
path leading from $l$ to $\beta$ visits LCA($\beta$,$l$); moreover, the node LCA($\beta$,$l$) provides the
minimal level in this path.

Observation (2) explains the rest of the computation of Step 3 below:

3.1 Using a parallel prefix sum computation find for each edge in the Euler path the node
of minimal level which appears between the edge and its preceding marked leaf in the
Euler path. This is done as follows:

a) Divide the Euler path into subpaths by throwing away all the edges from or to
marked leaves.

b) Handle each subpath separately. Let each edge e = $(u \to v)$ be a leaf in a balanced
binary tree. Its initial value will be the level of u (computed in Step 2.2).

c) Use the parallel prefix sum computation (with one change: replace the operation
sum by the operation minimum) to finish the computation.
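
Items a) to c) amount to a prefix computation with minimum in place of sum that restarts at every marked leaf. A minimal sketch follows, written as a doubling scan whose inner loop stands for one parallel round. It swaps the balanced binary tree for a simpler doubling scheme, and it works on a plain list of levels with a boolean list of restart positions. These simplifications, and the name `segmented_prefix_min`, are assumptions made for illustration.

```python
def segmented_prefix_min(levels, is_marked):
    """For each position t, return the minimum of levels over the positions
    from the most recent marked position (inclusive) up to t."""
    n = len(levels)
    val, flag = list(levels), list(is_marked)
    step = 1
    while step < n:
        new_val, new_flag = val[:], flag[:]
        for t in range(step, n):           # conceptually all t in parallel
            if not flag[t]:                # no restart seen yet within the window
                new_val[t] = min(val[t - step], val[t])
                new_flag[t] = flag[t - step]
        val, flag = new_val, new_flag
        step *= 2
    return val
```

With levels [3, 1, 2, 5, 4, 0, 6] and restarts at positions 2 and 6, the result is [3, 1, 2, 2, 2, 0, 6]. The scan uses about log n rounds; the balanced-tree version referred to in the text would bring the total number of operations down to O(n).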



[Br-74] R.P. Brent, "The parallel evaluation of general arithmetic expressions",
JACM 21,2 (1974), 201-206.

[BM-77] R.S. Boyer and J.S. Moore, "A fast string searching algorithm", Comm.
ACM 20 (1977), 762-772.

[CS-85] M.T. Chen and J. Seiferas, "Efficient and elegant subword tree
construction", in A. Apostolico and Z. Galil (editors), Combinatorial
Algorithms on Words, NATO ASI Series, Series F: Computer and System
Sciences, Vol. 12, Springer-Verlag, 1985, 97-107.

[FL-80] M. Fisher and L. Ladner, "Parallel prefix computation", JACM 27,4 (1980),
831-838.

[G-84] Z. Galil, "Optimal parallel algorithms for string matching", Proc. 16th ACM
Symposium on Theory of Computing, 1984, 240-248.

[G-85] Z. Galil, "Open problems in stringology", in A. Apostolico and Z. Galil
(editors), Combinatorial Algorithms on Words, NATO ASI Series, Series F:
Computer and System Sciences, Vol. 12, Springer-Verlag, 1985, 1-8.

[GS-83] Z. Galil and J.I. Seiferas, "Time-space-optimal string matching", J.
Computer and System Sciences 26 (1983), 280-294.

[HT-84] D. Harel and R.E. Tarjan, "Fast algorithms for finding nearest common
ancestors", SIAM J. Computing 13,2 (1984), 338-355.

[I-85] A.G. Ivanov, "Recognition of an approximate occurrence of words on a Turing
machine in real time", Math. USSR Izvestiya, Vol. 24 (1985), No. 3, 479-522.

[KMP-77] D.E. Knuth, J.H. Morris and V.R. Pratt, "Fast pattern matching in strings",
SIAM J. Comp. 6 (1977), 322-350.

[KR-80] R.M. Karp and M.O. Rabin, "Efficient randomized pattern-matching
algorithms", manuscript, 1980.

[LV-85a] G.M. Landau and U. Vishkin, "Efficient string matching in the presence of
errors", Proc. 26th IEEE FOCS, 1985, 126-136. This is a preliminary version of
[LV-85b] and [LV-85c].

[LV-85b] G.M. Landau and U. Vishkin, "Efficient string matching with k mismatches",
Theoret. Comput. Sci., to appear.



3.2 Using a parallel prefix sum computation find for each edge in the Euler path the node
of minimal level which appears between the edge and its succeeding marked leaf in the
Euler path (similar to 3.1).

3.3 Determine for each $l$ whether its preceding marked leaf or its succeeding marked leaf
will be selected for $\lambda$.

3.4 Using a computation which is similar to 3.1 and 3.2 above, find for each edge in the
tree its preceding marked leaf and its succeeding marked leaf. (Each edge actually needs
only one of these data, as implied by 3.3. However, the computation is much easier if we
compute both for all edges.)

3.5 Using 3.3 and 3.4 determine $\lambda$ for each non-marked leaf. If v is the node of minimal
level between $\lambda$ and $l$ then $f :=$ END(v).

Complexity. Step 3 takes O(log m) time using m/log m processors for each of the m
values of j and $\lceil \log m \rceil + 1$ values of i. Therefore, we get O(log m) time using $m^2$
processors. It is easy to achieve the same time-processor bound for Steps 1 and 2.

5.3.   Computation  Of  Array  Location 

The input is the pattern $a_1, \ldots, a_m$. (We assume that the alphabet of the characters
that can be used in the pattern is 1,...,p, for some p.) The output is a one dimensional
array LOCATION[1,...,p]. LOCATION(i) = $l$ means that character i in the alphabet appears
in location $l$ of the pattern ($a_l = i$).

The computation of the array LOCATION is straightforward.

Complexity. O(log m) time using m processors. The time can be reduced to O(1) if
simultaneous writes by several processors into the same memory location are allowed.
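
A serial sketch of the LOCATION array is given below. The text does not say which location is recorded when a character occurs more than once, nor what is stored for characters that do not occur; the sketch assumes that the last occurrence wins (mimicking simultaneous writes resolved in favour of one writer) and stores `None` for absent characters. The function name `location` is also an assumption.

```python
def location(pattern, p):
    """pattern is a list of characters drawn from 1..p.

    Returns loc with loc[c] = a 1-based location l such that pattern[l-1] == c,
    or None if c does not occur. Index 0 of loc is unused.
    """
    loc = [None] * (p + 1)
    for l, ch in enumerate(pattern, start=1):
        loc[ch] = l            # a later occurrence overwrites an earlier one
    return loc
```

For the pattern 2, 1, 2, 3 over the alphabet 1,...,4, character 2 is recorded at location 3 and character 4 has no location.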

Complexity of Part I. O(log m) time using $m^2$ processors.

Acknowledgement. We are grateful to Z. Galil for encouraging us to continue improving
our results.